The LSST AGN Data Challenge: Selection methods
Development of the Rubin Observatory Legacy Survey of Space and Time (LSST)
includes a series of Data Challenges (DC) arranged by various LSST Scientific
Collaborations (SC) that are taking place during the project's preoperational
phase. The AGN Science Collaboration Data Challenge (AGNSC-DC) is a partial
prototype of the expected LSST AGN data, aimed at validating machine learning
approaches for AGN selection and characterization in large surveys like LSST.
The AGNSC-DC took place in 2021, focusing on accuracy, robustness, and
scalability. The training and the blinded datasets were constructed to mimic
the future LSST release catalogs using the data from the Sloan Digital Sky
Survey Stripe 82 region and the XMM-Newton Large Scale Structure Survey region.
Data features were divided into astrometry, photometry, color, morphology,
redshift, and class label, with the addition of variability features and images.
We present the results of four solutions submitted to the DC, using both classical and
machine learning methods. We systematically test the performance of supervised
(support vector machine, random forest, extreme gradient boosting, artificial
neural network, convolutional neural network) and unsupervised (deep embedding
clustering) models when applied to the problem of classifying/clustering
sources as stars, galaxies or AGNs. We obtained a classification accuracy of
97.5% for supervised models, a clustering accuracy of 96.0% for unsupervised
models, and an accuracy of 95.0% with a classical approach for a blinded
dataset. We find that variability
features significantly improve the accuracy of the trained models, and that
correlation analysis among different bands enables a fast and inexpensive
first-order selection of quasar candidates.
Comment: Accepted by ApJ. 21 pages, 14 figures, 5 tables
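
The abstract quotes a "clustering accuracy" for the unsupervised models without defining it. A minimal sketch of one common definition follows, assuming (this is not stated in the source) that the score is the fraction of sources correctly labeled under the best one-to-one mapping from cluster IDs to the three classes. The function name `clustering_accuracy` and the toy labels are hypothetical and purely illustrative.

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Fraction of sources matched under the best one-to-one
    cluster-to-class mapping (assumed definition, see lead-in)."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")

    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(classes):
        raise ValueError("more clusters than classes: no one-to-one mapping")

    best_hits = 0
    # With three classes (star, galaxy, AGN) there are at most 3! = 6
    # candidate mappings, so a brute-force search is sufficient here.
    for assigned in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, assigned))
        hits = sum(mapping[c] == t for t, c in zip(true_labels, cluster_ids))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


# Toy data (hypothetical): eight sources, one AGN placed in the "star" cluster.
true_labels = ["star", "star", "galaxy", "galaxy", "AGN", "AGN", "AGN", "star"]
cluster_ids = [2, 2, 0, 0, 1, 1, 2, 2]
accuracy = clustering_accuracy(true_labels, cluster_ids)
print(accuracy)
```

For problems with many clusters, the brute-force search would normally be replaced by an assignment solver; it is kept here so the sketch depends only on the standard library.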